6 research outputs found

    Identifying Correlated Heavy-Hitters in a Two-Dimensional Data Stream

    Full text link
    We consider online mining of correlated heavy-hitters from a data stream. Given a stream of two-dimensional data, a correlated aggregate query first extracts a substream by applying a predicate along a primary dimension, and then computes an aggregate along a secondary dimension. Prior work on identifying heavy-hitters in streams has almost exclusively focused on identifying heavy-hitters on a single dimensional stream, and these yield little insight into the properties of heavy-hitters along other dimensions. In typical applications however, an analyst is interested not only in identifying heavy-hitters, but also in understanding further properties such as: what other items appear frequently along with a heavy-hitter, or what is the frequency distribution of items that appear along with the heavy-hitters. We consider queries of the following form: In a stream S of (x, y) tuples, on the substream H of all x values that are heavy-hitters, maintain those y values that occur frequently with the x values in H. We call this problem as Correlated Heavy-Hitters (CHH). We formulate an approximate formulation of CHH identification, and present an algorithm for tracking CHHs on a data stream. The algorithm is easy to implement and uses workspace which is orders of magnitude smaller than the stream itself. We present provable guarantees on the maximum error, as well as detailed experimental results that demonstrate the space-accuracy trade-off

    Detecting exploit patterns from network packet streams

    Get PDF
    Network-based Intrusion Detection Systems (NIDS), e.g., Snort, Bro or NSM, try to detect malicious network activity such as Denial of Service (DoS) attacks and port scans by monitoring network traffic. Research from network traffic measurement has identified various patterns that exploits on today's Internet typically exhibit. However, there has not been any significant attempt, so far, to design algorithms with provable guarantees for detecting exploit patterns from network traffic packets. In this work, we develop and apply data streaming algorithms to detect exploit patterns from network packet streams. In network intrusion detection, it is necessary to analyze large volumes of data in an online fashion. Our work addresses scalable analysis of data under the following situations. (1) Attack traffic can be stealthy in nature, which means detecting a few covert attackers might call for checking traffic logs of days or even months, (2) Traffic is multidimensional and correlations between multiple dimensions maybe important, and (3) Sometimes traffic from multiple sources may need to be analyzed in a combined manner. Our algorithms offer provable bounds on resource consumption and approximation error. Our theoretical results are supported by experiments over real network traces and synthetic datasets.</p

    CPRE 546X Spring Semester 2007 Wireless Sensor Networks A Survey on Tracking Mobile Objects in Sensor Networks

    No full text
    The classical problem of tracking the location of mobile objects in a wireless sensor network is one that has drawn considerable attention from the sensor network research community for quite a long time, as this problem provides an abstraction for many applications that can be built over wireless sensor networks. However, many technical issues still exist in this field. In order to provide a better understanding of the research challenges of this problem, this article presents a detailed investigation of current state-of-the-art protocols and algorithms that have been proposed for this problem. Finally, this survey proposes a distributed data structure and an algorithm to support the find and update operations, based on the Arrow protocol, whose respective goals are to return the location of a mobile object, which periodically advertises its presence by broadcasting a BEACON frame, to a user who issues a query, and update the data structure whenever the object moves.

    Identifying correlated heavy-hitters in a two-dimensional data stream

    No full text
    We consider online mining of correlated heavy-hitters (CHH) from a data stream. Given a stream of two-dimensional data, a correlated aggregate query first extracts a substream by applying a predicate along a primary dimension, and then computes an aggregate along a secondary dimension. Prior work on identifying heavy-hitters in streams has almost exclusively focused on identifying heavy-hitters on a single dimensional stream, and these yield little insight into the properties of heavy-hitters along other dimensions. In typical applications however, an analyst is interested not only in identifying heavy-hitters, but also in understanding further properties such as: what other items appear frequently along with a heavy-hitter, or what is the frequency distribution of items that appear along with the heavy-hitters. We consider queries of the following form: “In a stream S of (x, y) tuples, on the substream H of all x values that are heavy-hitters, maintain those y values that occur frequently with the x values in H”. We call this problem as CHH. We formulate an approximate formulation of CHH identification, and present an algorithm for tracking CHHs on a data stream. The algorithm is easy to implement and uses workspace much smaller than the stream itself. We present provable guarantees on the maximum error, as well as detailed experimental results that demonstrate the space-accuracy trade-off.This is an Accepted Manuscript of an article published by Taylor & Francis as Lahiri, Bibudh, Arko Provo Mukherjee, and Srikanta Tirthapura. "Identifying correlated heavy-hitters in a two-dimensional data stream." Data Mining and Knowledge Discovery 30, no. 4 (2016): 797-818. Available online: https://doi.org/10.1007/s10618-015-0438-6. Posted with permission.</p
    corecore